Fixing porter can strand parcels until daemon restart by sergey3bv · Pull Request #2132 · lightninglabs/taproot-assets

sergey3bv · 2026-05-13T12:53:59Z

Should close #2120

gemini-code-assist · 2026-05-13T12:54:09Z

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed.

This pull request addresses a bug where sweep parcels could become stranded in the chain porter if a failure occurred after the initial response was sent to the caller. It introduces a robust retry mechanism for specific post-delivery states and optimizes how sweeper-originated transactions interact with the porter to prevent unnecessary broadcasts.

Highlights

Post-delivery failure recovery: Implemented a retry mechanism for recoverable post-delivery failures in the chain porter, allowing stranded parcels to be requeued with exponential backoff instead of requiring a daemon restart.
Sweeper transaction optimization: Updated sweeper-originated channel transactions to skip redundant porter broadcasts while ensuring local addresses are still correctly imported.
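
A minimal sketch of the exponential backoff idea mentioned in the highlights. The names `backoffDelay`, `baseDelay`, and `maxDelay` are hypothetical, and the 30 second cap is an assumption; only the 1 second base delay is mentioned later in this thread. The PR's actual implementation is not shown here.

```go
package main

import (
	"fmt"
	"time"
)

// Hypothetical parameters. The 1 second base delay is mentioned in a review
// comment; the cap is an assumption made for this sketch.
const (
	baseDelay = time.Second
	maxDelay  = 30 * time.Second
)

// backoffDelay returns the wait before retry number attempt (0-based). The
// delay doubles with each attempt and is capped at maxDelay.
func backoffDelay(attempt uint) time.Duration {
	delay := baseDelay
	for i := uint(0); i < attempt; i++ {
		delay *= 2
		if delay >= maxDelay {
			return maxDelay
		}
	}

	return delay
}

func main() {
	for attempt := uint(0); attempt < 7; attempt++ {
		fmt.Printf("attempt %d: wait %v\n", attempt,
			backoffDelay(attempt))
	}
}
```

Capping the delay keeps a long-lived failure from pushing the next retry arbitrarily far into the future.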


gemini-code-assist

Code Review

This pull request introduces a retry mechanism for the chain porter to handle recoverable post-delivery failures, such as those occurring during transaction confirmation or proof transfer. It also refactors the state machine to ensure local addresses are imported even when anchor transaction broadcasts are skipped. Review feedback highlighted a potential deadlock in the advanceState goroutine when handling background parcels with unbuffered error channels. Other improvements suggested include clearing retry bookkeeping upon permanent failure to prevent memory leaks and implementing a maximum retry limit to avoid infinite loops.
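
To illustrate the deadlock concern: a plain send on an unbuffered error channel blocks until someone receives, so a goroutine that reports an error after the caller has stopped listening can hang forever. One common way around this, sketched below, is a non-blocking send. `reportErr` is a hypothetical helper, not a function from the PR, and whether the PR resolves the issue this way is not shown in this thread.

```go
package main

import (
	"errors"
	"fmt"
)

// reportErr (hypothetical) delivers err if the channel can accept it right
// now and otherwise gives up instead of blocking. It returns true when the
// error was handed off.
func reportErr(errChan chan<- error, err error) bool {
	select {
	case errChan <- err:
		return true
	default:
		return false
	}
}

func main() {
	err := errors.New("proof transfer failed")

	// Unbuffered channel with no receiver: a plain send would block here.
	unbuffered := make(chan error)
	fmt.Println(reportErr(unbuffered, err))

	// Buffered channel with free capacity: the send succeeds immediately.
	buffered := make(chan error, 1)
	fmt.Println(reportErr(buffered, err))
}
```

The trade-off is that a dropped error must be surfaced some other way (for example, by logging it), since the caller never sees it.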

kaldun-tech

I think the change here is solid and addresses the failure case. I have some minor suggestions about improving documentation and adding a test case for shutdown.

kaldun-tech · 2026-05-28T22:09:33Z

-	daveSweepTxHash := daveSweepBlocks[0].Transactions[1].TxHash()
+	daveSweepTxHash := resolveMinedTransferTxid(
+		t.t, dave, daveSweepBlocks[0],
+	)


We could use more context on the design decisions here. The old code failed with RBF (replace-by-fee) because it assumed that the coinbase transaction is at index 0, the sweep transaction is at index 1, and the block contains only one non-coinbase transaction. RBF sweep replacements, or multiple transactions mined in the same block, invalidate that assumption.

The new approach with resolveMinedTransferTxid finds the transaction in the block that tapd knows is a transfer. This decouples the tests from block transaction ordering, RBF replacements, multiple transactions in a block, and tapd timing issues.
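
The difference between the two lookups can be sketched as follows. This is not the implementation of resolveMinedTransferTxid, which is not shown in this thread; `findKnownTxid` is a hypothetical stand-in that uses plain strings for txids and a set for the transfers tapd knows about.

```go
package main

import "fmt"

// findKnownTxid (hypothetical) returns the first txid in the block that is
// also in the known set, regardless of its position in the block.
func findKnownTxid(blockTxids []string, known map[string]bool) (string, bool) {
	for _, txid := range blockTxids {
		if known[txid] {
			return txid, true
		}
	}

	return "", false
}

func main() {
	// The sweep is not at index 1 here, so an index-based lookup would
	// pick the wrong transaction.
	block := []string{"coinbase", "unrelated", "sweep"}
	known := map[string]bool{"sweep": true}

	fmt.Println("by index:", block[1])

	txid, ok := findKnownTxid(block, known)
	fmt.Println("by lookup:", txid, ok)
}
```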

kaldun-tech · 2026-05-28T22:15:26Z


+	closeStream, _, err := net.CloseChannel(local, chanPoint, false)
+	require.NoError(t.t, err)
+


Moving the close after SubscribeSendEvents fixes a race condition. With the old ordering:

1. CloseChannel starts the close process.
2. tapd processes the close and fires the SendStateComplete event.
3. SubscribeSendEvents is called, but the event has already fired.
4. waitForSendEvent blocks forever waiting for the missed event.

By subscribing first, all events from the close operation are captured. This implements the standard "subscribe before action" pattern.

So it's a good change; we could use more comments, IMO.
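
The ordering problem can be reproduced with a toy event source that does not replay past events. Everything in this sketch (`bus`, `subscribe`, `publish`, `received`) is hypothetical and only mirrors the shape of the race, not tapd's actual event API.

```go
package main

import "fmt"

// bus is a toy event source without replay: events published while nobody
// is subscribed are lost.
type bus struct {
	subs []chan string
}

// subscribe registers a new listener and returns its event channel.
func (b *bus) subscribe() <-chan string {
	ch := make(chan string, 1)
	b.subs = append(b.subs, ch)

	return ch
}

// publish delivers ev to every current subscriber.
func (b *bus) publish(ev string) {
	for _, ch := range b.subs {
		select {
		case ch <- ev:
		default: // Drop the event if the subscriber's buffer is full.
		}
	}
}

// received reports, without blocking, whether an event is waiting on ch.
func received(ch <-chan string) bool {
	select {
	case <-ch:
		return true
	default:
		return false
	}
}

func main() {
	// Act first, subscribe second: the event is missed.
	late := &bus{}
	late.publish("SendStateComplete")
	fmt.Println("subscribe after action:", received(late.subscribe()))

	// Subscribe first, act second: the event is captured.
	early := &bus{}
	events := early.subscribe()
	early.publish("SendStateComplete")
	fmt.Println("subscribe before action:", received(events))
}
```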

kaldun-tech · 2026-05-28T22:35:03Z

+		pollInterval    = 200 * time.Millisecond
+	)
+
+	require.Eventually(t, func() bool {


This will give us more detailed failure messages about the state and make the tests more robust.

kaldun-tech · 2026-05-28T22:36:22Z

+		blockHashSet    bool
+		blockHeight     uint32
+		blockHeightHint uint32
+		pollInterval    = 200 * time.Millisecond


Minor: 200 milliseconds is used in a few different spots; consider introducing a constant.

kaldun-tech · 2026-05-28T22:43:30Z

+	case SendStateStorePostAnchorTxConf:
+		fallthrough
+	case SendStateTransferProofs:
+		return true


Minor: The fallthrough style is unusual. More idiomatic:

	switch state {
	case SendStateWaitTxConf, SendStateStorePostAnchorTxConf,
		SendStateTransferProofs:

		return true
	default:
		return false
	}

Or even simpler:

	return state == SendStateWaitTxConf ||
		state == SendStateStorePostAnchorTxConf ||
		state == SendStateTransferProofs

kaldun-tech · 2026-05-29T17:30:13Z

+	if pkg == nil || pkg.OutboundPkg == nil ||
+		pkg.OutboundPkg.AnchorTx == nil {
+
+		return


Invariant: we can only add to postDeliveryRetryAttempts if AnchorTx is non-nil, and only need to delete it in the same case
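
A sketch of retry bookkeeping that respects this invariant and also covers the two earlier suggestions (a retry limit, and clearing state on permanent failure). The `retryBook` type, the string key standing in for the anchor txid, and the limit of 3 are all assumptions for illustration; the PR's postDeliveryRetryAttempts structure is not shown in this thread. The sketch is not safe for concurrent use without a mutex.

```go
package main

import "fmt"

// maxRetryAttempts is a hypothetical limit chosen for this sketch.
const maxRetryAttempts = 3

// retryBook (hypothetical) tracks retry attempts per anchor txid.
type retryBook struct {
	attempts map[string]int
}

func newRetryBook() *retryBook {
	return &retryBook{attempts: make(map[string]int)}
}

// next records one more attempt for txid and reports whether a retry is
// still allowed. An empty txid stands in for a nil anchor transaction and
// is never tracked, so there is never anything to delete for it either.
func (b *retryBook) next(txid string) bool {
	if txid == "" {
		return false
	}

	b.attempts[txid]++
	if b.attempts[txid] > maxRetryAttempts {
		// Permanent failure: drop the entry so the map cannot grow
		// without bound.
		delete(b.attempts, txid)
		return false
	}

	return true
}

// clear removes the bookkeeping for txid, for example after success.
func (b *retryBook) clear(txid string) {
	delete(b.attempts, txid)
}

func main() {
	book := newRetryBook()
	for i := 0; i < 5; i++ {
		fmt.Println(book.next("anchor-tx"), len(book.attempts))
	}
}
```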

kaldun-tech · 2026-05-29T17:37:32Z

+
+			return nil, fmt.Errorf("unable to import local "+
+				"addresses: %w", err)
+		}


This block imports the taproot output keys into lnd's wallet for outputs that belong to us, so the fix ensures addresses are always imported, regardless of who broadcasts.

It's a solid change. I think it would benefit from more comments.

kaldun-tech · 2026-05-29T17:39:46Z

 				"disk: %w", err)
 		}

+		ctx, cancel = p.WithCtxQuitNoTimeout()


This gives us a new context for importLocalAddresses, which is idempotent and can be retried. That is in contrast to the CtxBlocking context (line 2189) used for LogPendingParcel, which is a critical write to disk that must complete even during shutdown.
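
The distinction can be sketched with the standard context package. `withQuit` below is a hypothetical stand-in for what the name WithCtxQuitNoTimeout suggests (a context with no deadline that is canceled when the quit channel closes); it is not the helper's actual implementation. A context that must survive shutdown would simply not watch the quit channel.

```go
package main

import (
	"context"
	"fmt"
)

// withQuit (hypothetical) returns a context without a timeout that is
// canceled as soon as quit is closed or cancel is called.
func withQuit(quit <-chan struct{}) (context.Context, context.CancelFunc) {
	ctx, cancel := context.WithCancel(context.Background())

	go func() {
		select {
		case <-quit:
			cancel()
		case <-ctx.Done():
			// Canceled by the caller; nothing left to watch.
		}
	}()

	return ctx, cancel
}

func main() {
	quit := make(chan struct{})

	ctx, cancel := withQuit(quit)
	defer cancel()

	// Simulate shutdown: retryable work using ctx is interrupted.
	close(quit)
	<-ctx.Done()
	fmt.Println("quit-aware context:", ctx.Err())
}
```

Interrupting idempotent work on shutdown is safe because it can be repeated after restart, while a one-time critical write is better served by a context that ignores the quit signal.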

kaldun-tech · 2026-05-29T17:48:12Z

+	}
+}
+
+func TestAdvanceStatePermanentFailureClearsRetryBookkeeping(t *testing.T) {


Nit: The name suggests "permanent failure after max attempts". It actually tests a non-recoverable state failure.

kaldun-tech · 2026-05-29T17:49:35Z

+		t.Fatalf("did not expect pending parcel re-queue")
+	case <-time.After(100 * time.Millisecond):
+	}
+}


Minor: Consider adding a test case for process shutdown, similar to:
`func TestSchedulePostDeliveryRetryShutdownDuringDelay(t *testing.T) {
	porter := newTestChainPorter()
	pkg := newTestSendPackage(SendStateTransferProofs)

	recoverable := porter.schedulePostDeliveryRetry(
		pkg, SendStateTransferProofs, errors.New("failure"),
	)
	require.True(t, recoverable)

	// Close quit before timer fires (base delay is 1 second)
	close(porter.Quit)

	// Verify goroutine exits cleanly, no re-queue
	select {
	case <-porter.outboundParcels:
		t.Fatalf("did not expect parcel after shutdown")
	case <-time.After(1500 * time.Millisecond):
		// Success
	}
}`

lightninglabs-deploy · 2026-06-05T18:38:21Z

@sergey3bv, remember to re-request review from reviewers when ready

github-project-automation Bot added this to Taproot-Assets Project Board May 13, 2026

github-project-automation Bot moved this to 🆕 New in Taproot-Assets Project Board May 13, 2026

sergey3bv force-pushed the fix/strand-parcels branch from 214b5d6 to b7633c0 Compare May 13, 2026 12:55

sergey3bv marked this pull request as ready for review May 13, 2026 12:55

gemini-code-assist Bot reviewed May 13, 2026

Comment thread tapfreighter/chain_porter.go Outdated

Comment thread tapfreighter/chain_porter.go

Comment thread tapfreighter/chain_porter.go

sergey3bv force-pushed the fix/strand-parcels branch 3 times, most recently from 97bc49d to e1b2466 Compare May 14, 2026 07:40

sergey3bv added 2 commits May 14, 2026 11:59

fix(tapfreighter): recover stranded sweeper parcels without restart

b054513

chore: updated release notes

76f2208

sergey3bv force-pushed the fix/strand-parcels branch from e1b2466 to 76f2208 Compare May 14, 2026 08:59

kaldun-tech mentioned this pull request May 28, 2026

taprpc+tapdb: add GenesisInfo to AssetGroupBalance in ListBalances #2130

Open

5 tasks

kaldun-tech approved these changes May 29, 2026
